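State-of-the-art attacks on NLP models lack a shared definition of what constitutes a successful attack. We distill ideas from past work into a unified framework: a successful natural language adversarial example is a perturbation that fools the model and follows some linguistic constraints. We then analyze the outputs of two state-of-the-art synonym substitution attacks. We find that their perturbations often do not preserve semantics, and 38% introduce grammatical errors. Human surveys reveal that, to successfully preserve semantics, we need to significantly increase the minimum cosine similarity between the embeddings of swapped words and between the sentence encodings of the original and perturbed sentences. With constraints adjusted to better preserve semantics and grammaticality, the attack success rate drops by over 70 percentage points.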
translated by 谷歌翻译
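As a rough illustration of the two similarity constraints described in the abstract above, the following is a minimal sketch, not the authors' implementation. The function names and the threshold values (`min_word_sim`, `min_sent_sim`) are hypothetical; the abstract does not state the thresholds, and the vectors stand in for whatever word embeddings and sentence encodings an attack actually uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def passes_constraints(word_pairs, sent_orig, sent_pert,
                       min_word_sim=0.9, min_sent_sim=0.9):
    """Accept a perturbation only if every swapped word and the whole
    sentence stay close to the original.

    word_pairs: list of (original_embedding, substitute_embedding) tuples.
    sent_orig, sent_pert: sentence encodings before and after the swap.
    The default thresholds are illustrative placeholders.
    """
    for orig_vec, sub_vec in word_pairs:
        if cosine_similarity(orig_vec, sub_vec) < min_word_sim:
            return False
    return cosine_similarity(sent_orig, sent_pert) >= min_sent_sim
```

Raising either threshold rejects more candidate substitutions, which is consistent with the lower attack success rate the abstract reports once the constraints are tightened.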
Point-of-Care Ultrasound (POCUS) refers to clinician-performed and interpreted ultrasonography at the patient's bedside. Interpreting these images requires a high level of expertise, which may not be available during emergencies. In this paper, we support POCUS by developing classifiers that can aid medical professionals by diagnosing whether or not a patient has pneumothorax. We decomposed the task into multiple steps, using YOLOv4 to extract relevant regions of the video and a 3D sparse coding model to represent video features. Given the difficulty in acquiring positive training videos, we trained a small-data classifier with a maximum of 15 positive and 32 negative examples. To counteract this limitation, we leveraged subject matter expert (SME) knowledge to limit the hypothesis space, thus reducing the cost of data collection. We present results using two lung ultrasound datasets and demonstrate that our model is capable of achieving performance on par with SMEs in pneumothorax identification. We then developed an iOS application that runs our full system in less than 4 seconds on an iPad Pro, and less than 8 seconds on an iPhone 13 Pro, labeling key regions in the lung sonogram to provide interpretable diagnoses.
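Achieving safe and robust autonomy is the key bottleneck on the path towards broader adoption of autonomous vehicle technology. This motivates going beyond extrinsic metrics, such as miles between disengagements, and calls for approaches that embody safety by design. In this paper, we address some aspects of this challenge, with emphasis on issues of motion planning and prediction. We do this by describing novel approaches taken to solve selected sub-problems within an autonomous driving stack, in the process introducing the design philosophy adopted within Five. This includes safe-by-design planning, interpretable as well as verifiable prediction, and modelling of perception errors to enable effective sim-to-real and real-to-sim transfer within the testing pipeline of a realistic autonomous system.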
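Autonomous vehicles use a variety of sensors and machine-learned models to predict the behavior of surrounding road users. Most machine-learned models in the literature focus on quantitative error metrics, such as the root mean square error (RMSE), to learn and report their models' capabilities. This focus on quantitative error metrics tends to ignore the more important behavioral aspects of the models, raising the question of whether these models really predict human-like behavior. We therefore propose to analyze the output of machine-learned models much as we would analyze human data in conventional behavioral research. We introduce quantitative metrics to demonstrate the presence of three different behavioral phenomena in a naturalistic highway driving dataset: 1) the kinematics-dependence of who passes a merging point first, 2) lane changes by an on-highway vehicle to accommodate an on-ramp vehicle, and 3) lane changes by vehicles on the highway to avoid lead-vehicle conflicts. We then analyze the behavior of three machine-learned models using the same metrics. Even though the models' RMSE values differed, all the models captured the kinematics-dependent merging behavior but struggled, to varying degrees, to capture the more nuanced courtesy lane change and highway lane change behaviors. Additionally, a collision-aversion analysis during lane changes showed that the models struggled to capture a physical aspect of human driving: leaving an adequate gap between vehicles. Our analysis thus highlights the inadequacy of simple quantitative metrics and the need for a broader behavioral perspective when analyzing machine-learned models of human driving.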
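To make the point about RMSE concrete, here is a small sketch using made-up one-dimensional positions (in meters); the function names `rmse` and `min_gap` and all numbers are hypothetical and are not taken from the paper. It shows two predictions with the same RMSE that differ in the gap they leave to a lead vehicle.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length position sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def min_gap(lead_positions, follower_positions):
    """Smallest longitudinal gap between a lead vehicle and a follower."""
    return min(l - f for l, f in zip(lead_positions, follower_positions))


# Toy data, invented for illustration only.
lead = [10.0, 20.0, 30.0]      # lead vehicle positions
observed = [0.0, 10.0, 20.0]   # human-driven follower positions
pred_a = [-2.0, 8.0, 18.0]     # prediction that falls back by 2 m
pred_b = [2.0, 12.0, 22.0]     # prediction that closes in by 2 m

print(rmse(pred_a, observed), min_gap(lead, pred_a))
print(rmse(pred_b, observed), min_gap(lead, pred_b))
```

Both predictions are off by exactly 2 m at every step, so the error metric cannot tell them apart, yet only one of them erodes the safety gap. A behavioral metric such as the minimum gap separates the two cases.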
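We develop an unsupervised probabilistic model for heterogeneous electronic health record (EHR) data. Using a mixture model formulation, our approach directly models sequences of arbitrary length, such as medications and laboratory results. This allows for subgrouping and for incorporating the dynamics underlying heterogeneous data types. The model consists of a layered set of latent variables that encode underlying structure in the data. These variables represent subject subgroups at the top layer and unobserved states for sequences in the second layer. We train this model on episodic data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The resulting properties of the trained model yield novel insights from these complex and multifaceted data. In addition, we show how the model can be used to analyze sequences that contribute to the assessment of mortality likelihood.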
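Motion prediction of road users in traffic scenes is critical for autonomous driving systems that must make safe and robust decisions in complex dynamic environments. We present a novel motion prediction system for autonomous driving. Our system is based on the Bayesian inverse planning framework, which efficiently orchestrates map-based goal extraction, a classical control-based trajectory generator, and a mixture-of-experts collection of lightweight neural networks specialized in motion profile prediction. In contrast to many alternative methods, this modularity helps isolate performance factors and better interpret results, without compromising performance. The system addresses multiple aspects of interest, namely multi-modality, motion profile uncertainty, and the physical feasibility of trajectories. We report several experiments on the popular highway dataset NGSIM, demonstrating state-of-the-art performance in terms of trajectory error. We also perform a detailed analysis of our system's components, along with experiments that stratify the data by behavior, such as change-lane versus follow-lane, to provide insights into the challenges in this domain. Finally, we present a qualitative analysis to show other benefits of our approach, such as the ability to interpret the outputs.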
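Machine learning has long been regarded as a black box for predicting combustion chemical kinetics, owing to its extremely large number of parameters and the lack of evaluation standards and reproducibility. The current work aims to answer two basic questions about the deep neural network (DNN) method: what data the DNN needs, and how general the DNN method can be. Sampling and preprocessing determine the DNN training dataset and, in turn, affect the DNN's prediction ability. The current work proposes using the Box-Cox transformation (BCT) to preprocess the combustion data. In addition, this work compares different sampling methods with or without preprocessing, including the Monte Carlo method, manifold sampling, a generative neural network method (cycle-GAN), and a newly proposed multi-scale sampling. Our results reveal that a DNN trained on manifold data can capture the chemical kinetics in limited configurations but is not robust to perturbation, which is inevitable for a DNN coupled with a flow field. Monte Carlo and cycle-GAN sampling can cover a wider phase space but fail to capture small-scale intermediate species, producing poor prediction results. A three-layer DNN, based on the multi-scale method and trained without data from any specific flame simulation, can predict chemical kinetics in various scenarios and remains stable during temporal evolution. This single DNN is readily implemented in several CFD codes and validated in various combustors, including (1) zero-dimensional autoignition, (2) a one-dimensional freely propagating flame, (3) a two-dimensional jet flame with a triple-flame structure, and (4) three-dimensional turbulent lifted flames. The results demonstrate the satisfactory accuracy and generalization ability of the pre-trained DNN. Fortran and Python versions of the DNN and example code are attached in the supplementary material for reproducibility.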
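For readers unfamiliar with the Box-Cox transformation mentioned above, the following is a minimal sketch of the standard one-parameter form and its inverse. It is not the authors' preprocessing code; the function names and the value of `lam` used in the example are assumptions, since the abstract does not give the parameter.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform.

    (x**lam - 1) / lam for lam != 0, and log(x) for lam == 0.
    """
    if x < 0 or (x == 0 and lam <= 0):
        raise ValueError("x must be positive (or zero when lam > 0)")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Inverse of box_cox, mapping transformed values back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


# Hypothetical example: species mass fractions spanning many orders of magnitude.
lam = 0.1  # illustrative value only
for fraction in (1e-12, 1e-6, 1e-2, 0.5):
    print(fraction, box_cox(fraction, lam))
```

Unlike a plain logarithm, the transform with `lam > 0` stays finite at zero, which is convenient when small-scale quantities such as trace species can vanish.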
We demonstrate a proof-of-concept of a large language model conducting corporate-lobbying-related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and to provide explanations and confidence levels. For the bills the model deems relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. We also test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate-lobbying-related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
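The baseline referred to above, always predicting the most common outcome, can be sketched as follows. The labels here are invented toy data, not the paper's ground-truth labels, and the function name is hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


# Toy labels, made up for illustration: most bills are irrelevant to a company.
toy_labels = ["irrelevant", "irrelevant", "irrelevant", "relevant"]
print(majority_baseline_accuracy(toy_labels))
```

When one class dominates, this baseline is already high, so a model has to beat it before its relevance predictions can be considered informative.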
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.